NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

ML-NIC: accelerating machine learning inference using smart network interface cards

https://doi.org/10.3389/fcomp.2024.1493399

Kapoor, Raghav; Anastasiu, David C; Choi, Sean (January 2025, Frontiers in Computer Science)

Low-latency inference for machine learning models is increasingly becoming a necessary requirement, as these models are used in mission-critical applications such as autonomous driving, military defense (e.g., target recognition), and network traffic analysis. A widely studied and used technique to overcome this challenge is to offload some or all parts of the inference tasks onto specialized hardware such as graphic processing units. More recently, offloading machine learning inference onto programmable network devices, such as programmable network interface cards or a programmable switch, is gaining interest from both industry and academia, especially due to the latency reduction and computational benefits of performing inference directly on the data plane where the network packets are processed. Yet, current approaches are relatively limited in scope, and there is a need to develop more general approaches for mapping offloading machine learning models onto programmable network devices. To fulfill such a need, this work introduces a novel framework, called ML-NIC, for deploying trained machine learning models onto programmable network devices' data planes. ML-NIC deploys models directly into the computational cores of the devices to efficiently leverage the inherent parallelism capabilities of network devices, thus providing huge latency and throughput gains. Our experiments show that ML-NIC reduced inference latency by at least 6 × on average and in the 99th percentile and increased throughput by at least 16xwith little to no degradation in model effectiveness compared to the existing CPU solutions. In addition, ML-NIC can provide tighter guaranteed latency bounds in the presence of other network traffic with shorter tail latencies. Furthermore, ML-NIC reduces CPU and host server RAM utilization by 6.65% and 320.80 MB. Finally, ML-NIC can handle machine learning models that are 2.25 × larger than the current state-of-the-art network device offloading approaches.
more » « less
Full Text Available
EXCALIBUR: Encouraging and Evaluating Embodied Exploration

https://doi.org/10.1109/CVPR52729.2023.01434

Zhu, Hao; Kapoor, Raghav; Min, So Yeon; Han, Winson; Li, Jiatai; Geng, Kaiwen; Neubig, Graham; Bisk, Yonatan; Kembhavi, Aniruddha; Weihs, Luca (June 2023, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition)

Full Text Available
Heuristic and Cost-based Optimization for Diverse Provenance Tasks

https://doi.org/10.1109/TKDE.2018.2827074

Niu, Xing; Kapoor, Raghav; Glavic, Boris; Gawlick, Dieter; Liu, Zhen Hua; Krishnaswamy, Vasudha; Radhakrishnan, Venkatesh (April 2018, IEEE Transactions on Knowledge and Data Engineering)

A well-established technique for capturing database provenance as annotations on data is to instrument queries to propagate such annotations. However, even sophisticated query optimizers often fail to produce efficient execution plans for instrumented queries. We develop provenance-aware optimization techniques to address this problem. Specifically, we study algebraic equivalences targeted at instrumented queries and alternative ways of instrumenting queries for provenance capture. Furthermore, we present an extensible heuristic and cost-based optimization framework utilizing these optimizations. Our experiments confirm that these optimizations are highly effective, improving performance by several orders of magnitude for diverse provenance tasks.
more » « less
Full Text Available

Search for: All records